
    Exploring the Boundary Region of Tolerance Rough Sets for Feature Selection

    Of all the challenges facing the effective application of computational intelligence technologies to pattern recognition, dataset dimensionality is undoubtedly one of the primary impediments. In order for pattern classifiers to be efficient, a dimensionality reduction stage is usually performed prior to classification. Much use has been made of Rough Set Theory for this purpose, as it is completely data-driven and requires no additional information, whereas most other methods require some additional knowledge. However, traditional rough set-based methods in the literature require that all data be discrete, so real-valued or noisy data cannot be handled directly. This is usually addressed by employing a discretisation method, which can result in information loss. This paper proposes a new approach based on the tolerance rough set model, which is able to deal with real-valued data whilst retaining dataset semantics. More significantly, the paper describes the underlying mechanism by which this new approach utilises the information contained within the boundary region, or region of uncertainty. The use of this information can result in the discovery of more compact feature subsets and improved classification accuracy. These results are supported by an experimental evaluation which compares the proposed approach with a number of existing feature selection techniques.
    Key words: feature selection, attribute reduction, rough sets, classification
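
    To make the boundary-region idea concrete, the sketch below shows how tolerance classes and the three regions (lower approximation, upper approximation, and their difference, the boundary) might be computed for a candidate feature subset. It is an illustration only, not the paper's algorithm: the per-attribute similarity 1 - |a(x) - a(y)| / range, the averaging over attributes, the threshold tau, and the function names are all my own choices.

```python
import numpy as np

def tolerance_classes(X, subset, tau=0.8):
    """Tolerance class of each object: all objects whose mean per-attribute
    similarity over the features in `subset` is at least tau."""
    Xs = X[:, subset].astype(float)
    rng = Xs.max(axis=0) - Xs.min(axis=0)
    rng[rng == 0] = 1.0                           # guard against constant features
    classes = []
    for i in range(Xs.shape[0]):
        sim = 1.0 - np.abs(Xs - Xs[i]) / rng      # per-attribute similarity in [0, 1]
        classes.append(set(np.flatnonzero(sim.mean(axis=1) >= tau)))
    return classes

def regions(classes, concept):
    """Lower/upper approximation and boundary region of `concept`
    (a set of object indices) under the given tolerance classes."""
    lower = {i for i, c in enumerate(classes) if c <= concept}
    upper = {i for i, c in enumerate(classes) if c & concept}
    return lower, upper, upper - lower
```

    A subset evaluation in this spirit could then reward candidate subsets whose boundary regions over the decision classes are small, rather than relying on the lower approximation alone.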

    Measures for unsupervised fuzzy-rough feature selection

    For supervised learning, feature selection algorithms attempt to maximise a given function of predictive accuracy. This function usually considers the ability of feature vectors to reflect decision class labels. It is therefore intuitive to retain only those features that are related to, or lead to, these decision classes. However, in unsupervised learning, decision class labels are not provided, which poses questions such as: which features should be retained, and why not use all of the information? The problem is that not all features are important. Some of the features may be redundant, and others may be irrelevant and noisy. In this paper, some new fuzzy-rough set-based approaches to unsupervised feature selection are proposed. These approaches require no thresholding or domain information, can operate on real-valued data, and result in a significant reduction in dimensionality whilst retaining the semantics of the data.
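
    One way to read an unsupervised fuzzy-rough measure is as a fuzzy-rough dependency between features themselves: a subset is good if it determines the features it leaves out. The sketch below shows that idea under assumed choices (a 1 - |difference| / range similarity, the min t-norm and the Lukasiewicz implicator); it is not necessarily the measure used in the paper, and the function names are mine.

```python
import numpy as np

def fuzzy_similarity(col):
    """Fuzzy similarity relation for one real-valued feature:
    1 - |a(x) - a(y)| / range of the feature."""
    col = col.astype(float)
    rng = col.max() - col.min()
    rng = rng if rng > 0 else 1.0
    return 1.0 - np.abs(col[:, None] - col[None, :]) / rng

def dependency(X, subset, target):
    """Degree to which the features in `subset` determine feature `target`,
    via a fuzzy lower approximation (min t-norm, Lukasiewicz implicator)."""
    R_P = np.min([fuzzy_similarity(X[:, a]) for a in subset], axis=0)   # n x n
    R_q = fuzzy_similarity(X[:, target])                                # n x n
    # low[x, y] = inf_z I(R_P(x, z), R_q(z, y)),  with I(a, b) = min(1, 1 - a + b)
    low = np.min(np.minimum(1.0, 1.0 - R_P[:, :, None] + R_q[None, :, :]), axis=1)
    return low.max(axis=1).mean()               # mean positive-region membership
```

    A selection loop could then discard a feature q whenever dependency(X, P, q) for the retained subset P is close to 1; note that the (n, n, n) intermediate makes this sketch cubic in the number of objects.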

    Finding Fuzzy-rough Reducts with Fuzzy Entropy

    Abstract: Dataset dimensionality is undoubtedly the single most significant obstacle hampering any attempt to apply effective computational intelligence techniques to problem domains. In order to address this problem, a technique which reduces dimensionality is employed prior to the application of any classification learning. Such feature selection (FS) techniques attempt to select a subset of the original features of a dataset which is rich in the most useful information. The benefits can include improved data visualisation and transparency, a reduction in training and utilisation times and, potentially, improved prediction performance. Methods based on fuzzy-rough set theory have demonstrated this with much success. Such methods have employed the dependency function, which is based on the information contained in the lower approximation, as an evaluation step in the FS process. This paper presents three novel feature selection techniques employing fuzzy entropy to locate fuzzy-rough reducts. This approach is compared with two other fuzzy-rough feature selection approaches which utilise other measures for the selection of subsets.
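
    As a rough illustration of how a fuzzy entropy measure can score a candidate subset, the sketch below weights the Shannon entropy of the decision within each object's fuzzy class by that class's fuzzy cardinality. The similarity relation, the weighting scheme, and the function name are assumptions made for the sake of the example; the paper's exact formulation may differ.

```python
import numpy as np

def fuzzy_entropy(X, y, subset):
    """Fuzzy entropy of decision y given the fuzzy classes induced by `subset`:
    a cardinality-weighted sum of per-class Shannon entropies."""
    sims = []
    for a in subset:
        col = X[:, a].astype(float)
        rng = col.max() - col.min()
        rng = rng if rng > 0 else 1.0
        sims.append(1.0 - np.abs(col[:, None] - col[None, :]) / rng)
    R_P = np.min(sims, axis=0)                    # fuzzy similarity under the subset
    weights = R_P.sum(axis=1)                     # fuzzy cardinality of each class F_x
    labels = np.unique(y)
    H = 0.0
    for x in range(len(y)):
        Fx, ent = R_P[x], 0.0
        for c in labels:
            p = Fx[y == c].sum() / Fx.sum()       # p(decision = c | F_x)
            if p > 0:
                ent -= p * np.log2(p)
        H += (weights[x] / weights.sum()) * ent
    return H                                      # lower entropy: purer fuzzy classes
```

    Lower values indicate a subset whose fuzzy classes are purer with respect to the decision, which is the property a reduct search would drive towards.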

    Fuzzy Entropy-Assisted Fuzzy-Rough Feature Selection

    Abstract: Feature Selection (FS) is a dimensionality reduction technique that aims to select a subset of the original features of a dataset which offers the most useful information. The benefits of feature selection include improved data visualisation and transparency, reduced training and utilisation times, and improved prediction performance. Methods based on fuzzy-rough set theory (FRFS) have employed the dependency function to guide the process with much success. This paper presents a novel fuzzy-rough FS technique which is guided by fuzzy entropy. The use of this measure in fuzzy-rough feature selection can result in smaller subset sizes than those obtained through FRFS alone, with little loss or even an increase in overall classification accuracy.
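
    An entropy measure only scores a subset; a search strategy is still needed to build one. The sketch below is a generic greedy hill-climb in the spirit of fuzzy-rough feature selection, with a simplified stopping rule (stop when no candidate improves the score) rather than the reduct criterion a full implementation would use; the function name and signature are my own.

```python
def greedy_select(X, y, measure, max_features=None):
    """Greedily add the feature that most reduces measure(X, y, subset),
    e.g. the fuzzy_entropy sketch above; stop when nothing improves."""
    remaining = list(range(X.shape[1]))
    subset, best = [], float('inf')
    while remaining and (max_features is None or len(subset) < max_features):
        score, f = min((measure(X, y, subset + [f]), f) for f in remaining)
        if score >= best:
            break                                 # no candidate improves the score
        subset.append(f)
        remaining.remove(f)
        best = score
    return subset

# e.g. selected = greedy_select(X, y, fuzzy_entropy)
```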